

Search for: All records

Creators/Authors contains: "Baker, Ryan"


  1. In past work, time management interventions involving prompts, alerts, and planning tools have successfully nudged students in online courses, leading to higher engagement and improved performance. However, few studies have investigated the effectiveness of these interventions over time, examining whether their effectiveness is maintained or changes with dosage (i.e., how often an intervention is provided). In the current study, we conducted a randomized controlled trial to test whether the effect of a time management intervention changes over repeated use. Students in an online computer science course were randomly assigned to receive interventions on one of two schedules (i.e., high-dosage vs. low-dosage). We ran a two-way mixed ANOVA comparing students' assignment start times and performance across several weeks. Unexpectedly, we found neither a significant main effect of the intervention nor an interaction effect between the intervention and week of the course.
    Free, publicly-accessible full text available July 20, 2024
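    As a concrete illustration of the analysis above, the sketch below runs a two-way mixed ANOVA with the pingouin library. The column names (student, dosage, week, start_time) and the synthetic data are assumptions for illustration, not the study's actual variables or code.

```python
# Minimal sketch of a two-way mixed ANOVA (dosage: between-subjects,
# week: within-subjects) on assignment start time, using synthetic data.
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(0)
records = []
for student in range(60):
    dosage = "high" if student < 30 else "low"   # between-subjects factor
    for week in range(1, 6):                     # within-subjects factor
        records.append({
            "student": student,
            "dosage": dosage,
            "week": week,
            # hours before the deadline that the assignment was started
            "start_time": rng.normal(24, 6),
        })
df = pd.DataFrame(records)

aov = pg.mixed_anova(data=df, dv="start_time", within="week",
                     subject="student", between="dosage")
print(aov.round(3))
```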
  2. The General Data Protection Regulation (GDPR) in the European Union contains directions on how user data may be collected and stored, and when it must be deleted. As similar legislation is developed around the globe, there is the potential for repercussions across multiple fields of research, including educational data mining (EDM). Over the past two decades, the EDM community has taken consistent steps to protect learner privacy within our research whilst pursuing goals that will benefit their learning. However, recent privacy legislation may require our practices to change. The right to be forgotten states that users have the right to request that all of their data (including deidentified data generated by them) be removed. In this paper, we discuss the potential challenges this legislation poses for EDM research, including impacts on open science practices, data modeling, and data sharing. We also consider changes to EDM best practices that may aid compliance with this new legislation.
    Free, publicly-accessible full text available July 5, 2024
  3. Massive Open Online Courses (MOOCs) have increased the accessibility of quality educational content to a broader audience across a global network. They provide students with access to material that would be difficult to obtain locally, and provide educational researchers with an abundance of data. Despite the international reach of MOOCs, however, the majority of MOOC research does not account for demographic differences relating to learners' country of origin or cultural background, which have been shown to have implications for the robustness of predictive models and interventions. This paper presents an exploration of the role of nation-level metrics of culture, happiness, wealth, and size in the generalizability of completion prediction models across countries. The findings indicate that various dimensions of culture are predictive of cross-country model generalizability. Specifically, learners from indulgent, collectivist, uncertainty-accepting, or short-term oriented countries produce more generalizable predictive models of learner completion.
    Free, publicly-accessible full text available July 20, 2024
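    The cross-country evaluation described above can be sketched as follows: train a completion model on learners from one country, test it on each other country, and relate the transfer performance to nation-level cultural indices. The features, countries, and data below are invented for illustration; the paper's actual pipeline and cultural scores are not reproduced here.

```python
# Hypothetical sketch: cross-country transfer of a completion prediction model.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
countries = ["A", "B", "C"]
features = ["logins", "videos_watched", "forum_posts", "quiz_attempts"]

frames = []
for c in countries:
    X = rng.normal(size=(300, len(features)))             # clickstream features
    y = (X[:, 0] + rng.normal(size=300) > 0).astype(int)  # completed or not
    df = pd.DataFrame(X, columns=features)
    df["completed"], df["country"] = y, c
    frames.append(df)
data = pd.concat(frames, ignore_index=True)

for source in countries:
    train = data[data.country == source]
    model = LogisticRegression().fit(train[features], train.completed)
    for target in (c for c in countries if c != source):
        test = data[data.country == target]
        auc = roc_auc_score(test.completed,
                            model.predict_proba(test[features])[:, 1])
        print(f"trained on {source}, tested on {target}: AUC = {auc:.3f}")
# The resulting transfer AUCs could then be correlated with cultural
# dimensions (e.g., indulgence, collectivism) to test generalizability.
```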
  4. Students who take an online course, such as a MOOC, use the course's discussion forum to ask questions or reach out to instructors when they encounter an issue. However, reading and responding to students' questions is difficult to scale because of the time needed to consider each message. As a result, critical issues may be left unresolved, and students may lose the motivation to continue in the course. To help address this problem, we build predictive models that automatically determine the urgency of each forum post, so that these posts can be brought to instructors' attention. This paper goes beyond previous work by predicting not just a binary decision cut-off but a post's level of urgency on a 7-point scale. First, we train and cross-validate several models on an original data set of 3,503 posts from MOOCs at the University of Pennsylvania. Second, to determine the generalizability of our models, we test their performance on a separate, previously published data set of 29,604 posts from MOOCs at Stanford University. Whereas previous work on post urgency used only one data set, we evaluate prediction across different data sets and courses. The best-performing model was a support vector regressor trained on the Universal Sentence Encoder embeddings of the posts, achieving an RMSE of 1.1 on the training set and 1.4 on the test set. Understanding the urgency of forum posts enables instructors to focus their time more effectively and, as a result, better support student learning.
    Free, publicly-accessible full text available July 5, 2024
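    A minimal sketch of the best-performing approach reported above (Universal Sentence Encoder embeddings fed to a support vector regressor predicting a 1-7 urgency score) might look like the following. The example posts and labels are invented; the MOOC data sets themselves are not reproduced.

```python
import numpy as np
import tensorflow_hub as hub
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

# Load the Universal Sentence Encoder (512-dimensional sentence embeddings).
embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

posts = [
    "When is the next live session?",
    "My submission keeps failing and the deadline is tonight!",
    "Thanks for the great lecture.",
    "I cannot access the week 3 materials at all.",
]
urgency = np.array([2.0, 7.0, 1.0, 6.0])  # 1 = not urgent, 7 = extremely urgent

X = embed(posts).numpy()
model = SVR(kernel="rbf").fit(X, urgency)

preds = model.predict(X)
rmse = np.sqrt(mean_squared_error(urgency, preds))
print(f"training RMSE on this toy sample: {rmse:.2f}")
```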
  5. We introduce a novel technique for automatically summarizing lecture videos using large language models such as GPT-3, and we present a user study investigating the effects on the studying experience when automatic summaries are added to lecture videos. We test students under different conditions and find that students who are shown a summary next to a lecture video perform better on quizzes designed to test the course materials than students who have access only to the video or only to the summary. Our findings suggest that adding automatic summaries to lecture videos enhances the learning experience. Qualitatively, students preferred summaries when studying under time constraints.
    Free, publicly-accessible full text available July 1, 2024
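    For illustration, a lecture summary could be generated along the lines below. The study used GPT-3; this sketch instead calls the current OpenAI chat-completions API with an arbitrary model name and a placeholder transcript, so it should be read as an assumption-laden example rather than the authors' implementation.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

transcript = "..."  # the full lecture transcript would go here

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder model choice
    messages=[
        {"role": "system",
         "content": "You summarize lecture transcripts for students."},
        {"role": "user",
         "content": f"Summarize this lecture in a few bullet points:\n\n{transcript}"},
    ],
)
print(response.choices[0].message.content)
```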
  6. Threshold concepts are transformative elements of domain knowledge that enable those who attain them to engage domain tasks in a more sophisticated way. Existing research tends to focus on the identification of threshold concepts within undergraduate curricula as challenging concepts that prevent attainment of subsequent content until mastered. Recently, threshold concepts have likewise become a research focus at the level of doctoral studies. However, such research faces several limitations. First, the generalizability of findings in past research has been limited due to the relatively small numbers of participants in available studies. Second, it is not clear which specific skills are contingent upon mastery of identified threshold concepts, making it difficult to identify appropriate times for possible intervention. Third, threshold concepts observed across disciplines may or may not mask important nuances that apply within specific disciplinary contexts. The current study therefore employs a novel Bayesian knowledge tracing (BKT) approach to identify possible threshold concepts using a large data set from the biological sciences. Using rubric-scored samples of doctoral students’ sole-authored scholarly writing, we apply BKT as a strategy to identify potential threshold concepts by examining the ability of performance scores for specific research skills to predict score gains on other research skills. Findings demonstrate the effectiveness of this strategy, as well as convergence between results of the current study and more conventional, qualitative results identifying threshold concepts at the doctoral level. 
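    The standard Bayesian knowledge tracing update used in this line of work can be written in a few lines; the sketch below uses arbitrary slip, guess, and learn parameters and invented observations, not the paper's rubric-scored writing data.

```python
def bkt_update(p_know, correct, slip=0.1, guess=0.2, learn=0.15):
    """Return P(skill known) after observing one scored attempt."""
    if correct:
        conditioned = (p_know * (1 - slip)) / (
            p_know * (1 - slip) + (1 - p_know) * guess)
    else:
        conditioned = (p_know * slip) / (
            p_know * slip + (1 - p_know) * (1 - guess))
    # Allow for learning between opportunities.
    return conditioned + (1 - conditioned) * learn

# Trace mastery of one hypothetical research skill across scored attempts.
p = 0.3  # prior probability the skill is already known
for outcome in [0, 0, 1, 1, 1]:
    p = bkt_update(p, outcome)
    print(f"observed {'correct' if outcome else 'incorrect'}: P(known) = {p:.3f}")
```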
  7. Research into "gaming the system" behavior in intelligent tutoring systems (ITS) has been around for almost two decades, and detection has been developed for many ITSs. Machine learning models can detect this behavior in both real-time and in historical data. However, intelligent tutoring system designs often change over time, in terms of the design of the student interface, assessment models, and data collection log schemas. Can gaming detectors still be trusted, a decade or more after they are developed? In this research, we evaluate the robustness/degradation of gaming detectors when trained on old data logs and evaluated on current data logs. We demonstrate that some machine learning models developed using past data are still able to predict gaming behavior from student data collected 16 years later, but that there is considerable variance in how well different algorithms perform over time. We demonstrate that a classic decision tree algorithm maintained its performance while more contemporary algorithms struggled to transfer to new data, even though they exhibited better performance on both new and old data alone. Examining the feature importances provides some explanation for the differences in performance between models, and offers some insight into how we might safeguard against detector rot over time. 
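    The "detector rot" evaluation described above can be sketched as fitting a detector on historical logs and testing it on logs collected years later, then inspecting feature importances. The feature names and data below are synthetic stand-ins for the real log schemas.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
features = ["time_between_actions", "repeated_errors", "hint_requests"]

def make_logs(n, drift=0.0):
    """Generate synthetic tutor logs; drift mimics interface/schema change."""
    X = rng.normal(loc=drift, size=(n, len(features)))
    y = (X[:, 1] + 0.5 * X[:, 2] + rng.normal(size=n) > 0).astype(int)
    return pd.DataFrame(X, columns=features), y

X_old, y_old = make_logs(2000)              # historical logs
X_new, y_new = make_logs(2000, drift=0.3)   # logs collected years later

clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_old, y_old)
auc = roc_auc_score(y_new, clf.predict_proba(X_new)[:, 1])
print(f"AUC of the old-data detector on newer logs: {auc:.3f}")
print(dict(zip(features, clf.feature_importances_.round(3))))
```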